256 research outputs found

    A model for Bioinformatics training : the Marine Biological Laboratory

    Author Posting. © The Authors, 2010. This is the author's version of the work. It is posted here by permission of Oxford University Press for personal use, not for redistribution. The definitive version was published in Briefings in Bioinformatics 6 (2010): 610-615, doi:10.1093/bib/bbq029. Many areas of science such as biology, medicine, and oceanography are becoming increasingly data-rich, and most programs that train scientists do not address the informatics techniques or technologies that are necessary for managing and analyzing large amounts of data. Educational resources for scientists in informatics are scarce, yet scientists need the skills and knowledge to work with informaticians and manage graduate students and post-docs in informatics projects. The Marine Biological Laboratory houses a world-renowned library and is involved in a number of informatics projects in the sciences. The MBL has been home to the National Library of Medicine's BioMedical Informatics Course for nearly two decades and is committed to educating scientists and other scholars in informatics. In an innovative, immersive learning experience, Grant Yamashita, a biologist and post-doc at Arizona State University, visited the Science Informatics Group at MBL to learn firsthand how informatics is done and how informatics teams work. Hands-on work with developers, systems administrators, librarians, and other scientists provided an invaluable education in informatics and is a model for future science informatics training. This work was supported by the National Science Foundation [0926026 to G.Y., SES-0623176]; Jewett Foundation; Ellison Medical Foundation.

    Does Collocation Inform the Impact of Collaboration?

    Background: It has been shown that large interdisciplinary teams working across geography are more likely to be impactful. We asked whether the physical proximity of collaborators remained a strong predictor of the scientific impact of their research as measured by citations of the resulting publications. Methodology/Principal Findings: Articles published by Harvard investigators from 1993 to 2003 with at least two authors were identified in the domain of biomedical science. Each collaboration was geocoded to the precise three-dimensional location of its authors. Physical distances between any two coauthors were calculated and associated with corresponding citations. Relationships between the distance of coauthors and citations for four author relationships (first-last, first-middle, last-middle, and middle-middle) were investigated at different spatial scales. At all sizes of collaborations (from two authors to dozens of authors), geographical proximity between first and last author is highly informative of impact at the microscale (i.e. within a building) and beyond. The mean citation count for the first-last author relationship decreased as the distance between them increased, both within the sub-kilometer range and across the three categorized ranges (in the same building, same city, or different city). Such a trend was not seen in the other three author relationships. Conclusions/Significance: Despite the positive impact of emerging communication technologies on scientific research, our results provide striking evidence for the role of physical proximity as a predictor of the impact of collaborations. Funding: Ewing Marion Kauffman Foundation; Harvard University. Office of the Provost (1992-

    Developing and applying heterogeneous phylogenetic models with XRate

    Modeling sequence evolution on phylogenetic trees is a useful technique in computational biology. Especially powerful are models which take account of the heterogeneous nature of sequence evolution according to the "grammar" of the encoded gene features. However, beyond a modest level of model complexity, manual coding of models becomes prohibitively labor-intensive. We demonstrate, via a set of case studies, the new built-in model-prototyping capabilities of XRate (macros and Scheme extensions). These features allow rapid implementation of phylogenetic models which would have previously been far more labor-intensive. XRate's new capabilities for lineage-specific models, ancestral sequence reconstruction, and improved annotation output are also discussed. XRate's flexible model-specification capabilities and computational efficiency make it well-suited to developing and prototyping phylogenetic grammar models. XRate is available as part of the DART software package: http://biowiki.org/DART . Comment: 34 pages, 3 figures, glossary of XRate model terminology

    Subgraphs in random networks

    Understanding the subgraph distribution in random networks is important for modelling complex systems. In classic Erdős networks, which exhibit a Poissonian degree distribution, the number of appearances of a subgraph G with n nodes and g edges scales with network size as \langle G \rangle ~ N^{n-g}. However, many natural networks have a non-Poissonian degree distribution. Here we present approximate equations for the average number of subgraphs in an ensemble of random sparse directed networks, characterized by an arbitrary degree sequence. We find new scaling rules for the commonly occurring case of directed scale-free networks, in which the outgoing degree distribution scales as P(k) ~ k^{-\gamma}. Considering the power exponent of the degree distribution, \gamma, as a control parameter, we show that random networks exhibit transitions between three regimes. In each regime the subgraph number of appearances follows a different scaling law, \langle G \rangle ~ N^{\alpha}, where \alpha=n-g+s-1 for \gamma<2, \alpha=n-g+s+1-\gamma for 2<\gamma<\gamma_c, and \alpha=n-g for \gamma>\gamma_c, s is the maximal outdegree in the subgraph, and \gamma_c=s+1. We find that certain subgraphs appear much more frequently than in Erdős networks. These results are in very good agreement with numerical simulations. This has implications for detecting network motifs, subgraphs that occur in natural networks significantly more than in their randomized counterparts. Comment: 8 pages, 5 figures
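
    As a reading aid for the three scaling regimes quoted in the abstract above, the exponent \alpha can be written as a small piecewise function. This is only a sketch of the formula as stated in the abstract, not code from the paper: the function name and the feed-forward-loop example are illustrative assumptions, and the boundary values \gamma=2 and \gamma=\gamma_c (which the abstract leaves open) are assigned to the neighbouring regime, where the adjacent formulas happen to agree.

```python
def subgraph_scaling_exponent(n, g, s, gamma):
    """Exponent alpha in <G> ~ N^alpha for a subgraph with n nodes, g edges
    and maximal outdegree s, in a directed scale-free network whose
    out-degree distribution scales as P(k) ~ k^(-gamma).

    Hypothetical helper: it only transcribes the piecewise rule from the
    abstract. The abstract uses strict inequalities; the boundary points
    are handled here by continuity (the neighbouring formulas coincide).
    """
    gamma_c = s + 1  # critical exponent, as defined in the abstract
    if gamma < 2:
        return n - g + s - 1
    if gamma < gamma_c:
        return n - g + s + 1 - gamma
    return n - g  # same exponent as in Erdős networks


# Illustrative example (an assumption, not taken from the abstract):
# a feed-forward loop has n = 3 nodes, g = 3 edges, and one node with
# outdegree 2, so s = 2 and gamma_c = 3.
for gamma in (1.5, 2.5, 3.5):
    print(gamma, subgraph_scaling_exponent(3, 3, 2, gamma))
```

    For \gamma above \gamma_c the function returns n-g, the Erdős value, while smaller \gamma gives a larger exponent, which matches the abstract's statement that certain subgraphs appear much more frequently than in Erdős networks.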

    CORRIE: enzyme sequence annotation with confidence estimates

    Using a previously developed automated method for enzyme annotation, we report the re-annotation of the ENZYME database and the analysis of local error rates per class. In control experiments, we demonstrate that the method is able to correctly re-annotate 91% of all Enzyme Classification (EC) classes with high coverage (755 out of 827). Only 44 enzyme classes are found to contain false positives, while the remaining 28 enzyme classes are not represented. We also show cases where the re-annotation procedure results in partial overlaps for those few enzyme classes where a certain inconsistency might appear between homologous proteins, mostly due to function specificity. Our results allow the interactive exploration of the EC hierarchy for known enzyme families as well as putative enzyme sequences that may need to be classified within the EC hierarchy. These aspects of our framework have been incorporated into a web-server, called CORRIE, which stands for Correspondence Indicator Estimation and allows the interactive prediction of a functional class for putative enzymes from sequence alone, supported by probabilistic measures in the context of the pre-calculated Correspondence Indicators of known enzymes with the functional classes of the EC hierarchy. The CORRIE server is available at:

    Constructive links between some morphological hierarchies on edge-weighted graphs

    In edge-weighted graphs, we provide a unified presentation of a family of popular morphological hierarchies such as component trees, quasi flat zones, binary partition trees, and hierarchical watersheds. For any hierarchy of this family, we show if (and how) it can be obtained from any other element of the family. In this sense, the main contribution of this paper is the study of all constructive links between these hierarchies.

    Brain Radiation Information Data Exchange (BRIDE): Integration of experimental data from low-dose ionising radiation research for pathway discovery

    Background: The underlying molecular processes representing stress responses to low-dose ionising radiation (LDIR) in mammals are just beginning to be understood. In particular, LDIR effects on the brain and their possible association with neurodegenerative disease are currently being explored using omics technologies. Results: We describe a light-weight approach for the storage, analysis and distribution of relevant LDIR omics datasets. The data integration platform, called BRIDE, contains information from the literature as well as experimental information from transcriptomics and proteomics studies. It deploys a hybrid, distributed solution using both local storage and cloud technology. Conclusions: BRIDE can act as a knowledge broker for LDIR researchers, to facilitate molecular research on the systems biology of LDIR response in mammals. Its flexible design can capture a range of experimental information for genomics, epigenomics, transcriptomics, and proteomics. The data collection is available at:

    On morphological hierarchical representations for image processing and spatial data clustering

    Hierarchical data representations in the context of classification and data clustering were put forward during the fifties. Recently, hierarchical image representations have gained renewed interest for segmentation purposes. In this paper, we briefly survey fundamental results on hierarchical clustering and then detail recent paradigms developed for the hierarchical representation of images in the framework of mathematical morphology: constrained connectivity and ultrametric watersheds. Constrained connectivity can be viewed as a way to constrain an initial hierarchy in such a way that a set of desired constraints are satisfied. The framework of ultrametric watersheds provides a generic scheme for computing any hierarchical connected clustering, in particular when such a hierarchy is constrained. The suitability of this framework for solving practical problems is illustrated with applications in remote sensing.

    Rise and Demise of Bioinformatics? Promise and Progress

    The field of bioinformatics and computational biology has gone through a number of transformations during the past 15 years, establishing itself as a key component of new biology. This spectacular growth has been challenged by a number of disruptive changes in science and technology. Despite the apparent fatigue of the linguistic use of the term itself, bioinformatics has grown perhaps to a point beyond recognition. We explore both historical aspects and future trends and argue that as the field expands, key questions remain unanswered and acquire new meaning while at the same time the range of applications is widening to cover an ever increasing number of biological disciplines. These trends appear to be pointing to a redefinition of certain objectives, milestones, and possibly the field itself.